Overview

Dataset Statistics

Number of Variables 22
Number of Rows 253680
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 23899
Duplicate Rows (%) 9.4%
Total Size in Memory 42.6 MB
Average Row Size in Memory 176.0 B
Variable Types
  • Categorical: 18
  • Numerical: 4
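
The duplicate figures above agree with each other: 23899 of 253680 rows is about 9.4%. The following is a minimal sketch of how such overview counts could be derived, using a made-up four-row table. The overview helper and the toy rows are illustrative assumptions. Counting every repeat after the first occurrence is one common convention; the report does not state which convention it uses.

```python
from collections import Counter

def overview(rows):
    """Row count, missing cells, and duplicate rows for a list of tuples."""
    n_rows = len(rows)
    missing = sum(1 for row in rows for cell in row if cell is None)
    counts = Counter(rows)
    # Assumption: a duplicate is any repeat beyond the first occurrence.
    duplicates = sum(c - 1 for c in counts.values())
    pct = 100.0 * duplicates / n_rows if n_rows else 0.0
    return {
        "rows": n_rows,
        "missing_cells": missing,
        "duplicate_rows": duplicates,
        "duplicate_pct": pct,
    }

# Hypothetical toy rows (HighBP, HighChol, BMI); not taken from the dataset.
toy = [(1.0, 1.0, 40.0), (0.0, 0.0, 25.0), (1.0, 1.0, 40.0), (1.0, 0.0, 28.0)]
stats = overview(toy)

# Arithmetic check of the percentage quoted in the report.
reported_pct = round(100.0 * 23899 / 253680, 2)
print(stats, reported_pct)
```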

Dataset Insights

Similar Distribution
  • MentHlth and PhysHlth have similar distributions
Skewed
  • BMI is skewed
  • MentHlth is skewed
  • PhysHlth is skewed
Duplicates
  • Dataset has 23899 (9.42%) duplicate rows
Constant Length
  • HeartDiseaseorAttack has constant length 3
  • HighBP has constant length 3
  • HighChol has constant length 3
  • CholCheck has constant length 3
  • Smoker has constant length 3
  • Stroke has constant length 3
  • Diabetes has constant length 3
  • PhysActivity has constant length 3
  • Fruits has constant length 3
  • Veggies has constant length 3
  • HvyAlcoholConsump has constant length 3
  • AnyHealthcare has constant length 3
  • NoDocbcCost has constant length 3
  • GenHlth has constant length 3
  • DiffWalk has constant length 3
  • Sex has constant length 3
  • Education has constant length 3
  • Income has constant length 3
Zeros
  • MentHlth has 175680 (69.25%) zeros
  • PhysHlth has 160052 (63.09%) zeros
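
The report does not say how it decides that MentHlth and PhysHlth have similar distributions. One common way to quantify similarity is the two-sample Kolmogorov-Smirnov statistic, the largest gap between the two empirical CDFs. The sketch below is an assumption about how such a check could look, not the report's documented method. The sample values are invented.

```python
from bisect import bisect_right

def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of two samples (0 = identical)."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # Both CDFs are step functions, so checking every observed value suffices.
    for x in sorted(set(a_sorted) | set(b_sorted)):
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

# Hypothetical "days per month" samples, zero-heavy like the two columns.
ment = [0, 0, 0, 2, 30]
phys = [0, 0, 3, 5, 30]
d = ks_statistic(ment, phys)
print(d)
```

A small statistic means the two columns put similar mass at similar values. Any cut-off for calling them "similar" would be a further assumption.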

Variables


HeartDiseaseorAttack

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240
  • The largest value (0.0) is over 9.62 times larger than the second largest value (1.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 0.0
2nd row 0.0
3rd row 0.0
4th row 0.0
5th row 0.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (0.0, 1.0) take over 50.0%
  • The largest value (0.0) is over 9.62 times larger than the second largest value (1.0)
  • HeartDiseaseorAttack has words of constant length
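
The length and character statistics above are what one would expect if the 0/1 flags are stored as floats and profiled as text. Every value then renders as a three-character string such as 0.0, of which two characters are decimal digits, giving 2 × 253680 = 507360. The sketch below illustrates this under that assumption. The sample values are hypothetical.

```python
import unicodedata
from statistics import mean, pstdev

flags = [0.0, 0.0, 1.0, 0.0, 1.0]      # hypothetical sample of a 0/1 column
as_text = [str(v) for v in flags]       # each float renders as e.g. "0.0"
lengths = [len(s) for s in as_text]

length_mean = mean(lengths)             # every string has 3 characters
length_std = pstdev(lengths)            # so the spread is zero

# Unicode category "Nd" is "Decimal Number", the label used in the report.
digits = sum(unicodedata.category(ch) == "Nd" for s in as_text for ch in s)
letters = sum(ch.isalpha() for s in as_text for ch in s)

# Two digits per row across the full dataset.
digits_full = 2 * 253680
print(length_mean, length_std, digits, letters, digits_full)
```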

HighBP

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 1.0
2nd row 0.0
3rd row 1.0
4th row 1.0
5th row 1.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (0.0, 1.0) take over 50.0%
  • HighBP has words of constant length

HighChol

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 1.0
2nd row 0.0
3rd row 1.0
4th row 0.0
5th row 1.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (0.0, 1.0) take over 50.0%
  • HighChol has words of constant length

CholCheck

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240
  • The largest value (1.0) is over 25.79 times larger than the second largest value (0.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 1.0
2nd row 0.0
3rd row 1.0
4th row 1.0
5th row 1.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (1.0, 0.0) take over 50.0%
  • The largest value (1.0) is over 25.79 times larger than the second largest value (0.0)
  • CholCheck has words of constant length
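
The "times larger" insight compares how often the two most common categories occur, not the category values themselves. The sketch below shows that ratio on invented data; the top_ratio helper is hypothetical. For a two-category column with no missing values, a ratio r implies a majority share of r / (1 + r), so a ratio above 25.79 implies a share above roughly 96%.

```python
from collections import Counter

def top_ratio(values):
    """Most common value, runner-up, and the ratio of their frequencies.

    Requires at least two distinct values.
    """
    (top, n_top), (second, n_second) = Counter(values).most_common(2)
    return top, second, n_top / n_second

# Hypothetical binary column: nine 1.0 flags and three 0.0 flags.
toy = [1.0] * 9 + [0.0] * 3
top, second, ratio = top_ratio(toy)

# Majority share implied by the ratio reported for CholCheck (binary column).
implied_share = 25.79 / (1 + 25.79)
print(top, second, ratio, round(implied_share, 4))
```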

BMI

numerical

Approximate Distinct Count 84
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 4058880
Mean 28.3824
Minimum 12
Maximum 98
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • BMI is skewed right (γ1 = 2.122)

Quantile Statistics

Minimum 12
5-th Percentile 20
Q1 24
Median 27
Q3 31
95-th Percentile 40
Maximum 98
Range 86
IQR 7
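
Range and IQR follow directly from the quantiles listed above. The sketch below reproduces them and also computes Tukey fences (1.5 × IQR beyond the quartiles), a common outlier convention. Whether the outlier count reported for BMI uses this rule is an assumption; the report does not say.

```python
def tukey_fences(q1, q3, k=1.5):
    """Lower and upper fences; values outside are often treated as outliers."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quantiles as reported for BMI.
minimum, q1, q3, maximum = 12, 24, 31, 98

value_range = maximum - minimum   # matches "Range 86"
iqr = q3 - q1                     # matches "IQR 7"
low, high = tukey_fences(q1, q3)  # hypothetical outlier thresholds
print(value_range, iqr, low, high)
```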

Descriptive Statistics

Mean 28.3824
Standard Deviation 6.6087
Variance 43.6748
Sum 7.2e+06
Skewness 2.122
Kurtosis 10.9972
Coefficient of Variation 0.2328
  • BMI is not normally distributed (p-value 2.7510179714794805e-09)
  • BMI has 9847 outliers
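
The descriptive statistics are related to each other: variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. The sketch below computes these, together with moment-based skewness (γ1) and excess kurtosis, on an invented sample. The describe helper is hypothetical, and population (divide-by-n) moments are an assumption; the report may use sample-corrected estimators.

```python
from statistics import fmean, pstdev

def describe(xs):
    """Mean, spread, and shape statistics using population moments."""
    m = fmean(xs)
    sd = pstdev(xs, mu=m)
    m3 = fmean([(x - m) ** 3 for x in xs])
    m4 = fmean([(x - m) ** 4 for x in xs])
    skew = m3 / sd ** 3
    return {
        "mean": m,
        "std": sd,
        "variance": sd ** 2,
        "cv": sd / m,
        "skewness": skew,
        "excess_kurtosis": m4 / sd ** 4 - 3,
        "direction": "right" if skew > 0 else "left" if skew < 0 else "none",
    }

toy = [1, 2, 3, 4, 10]            # hypothetical right-skewed sample
result = describe(toy)

# Consistency checks on the rounded values reported for BMI.
bmi_variance = 6.6087 ** 2        # close to the reported 43.6748
bmi_cv = 6.6087 / 28.3824         # close to the reported 0.2328
print(result, bmi_variance, bmi_cv)
```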

Smoker

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 1.0
2nd row 1.0
3rd row 0.0
4th row 0.0
5th row 0.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (0.0, 1.0) take over 50.0%
  • Smoker has words of constant length

Stroke

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240
  • The largest value (0.0) is over 23.65 times larger than the second largest value (1.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 0.0
2nd row 0.0
3rd row 0.0
4th row 0.0
5th row 0.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (0.0, 1.0) take over 50.0%
  • The largest value (0.0) is over 23.65 times larger than the second largest value (1.0)
  • Stroke has words of constant length

Diabetes

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240
  • The largest value (0.0) is over 6.05 times larger than the second largest value (2.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 0.0
2nd row 0.0
3rd row 0.0
4th row 0.0
5th row 0.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (0.0, 2.0) take over 50.0%
  • The largest value (0.0) is over 6.05 times larger than the second largest value (2.0)
  • Diabetes has words of constant length

PhysActivity

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240
  • The largest value (1.0) is over 3.11 times larger than the second largest value (0.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 0.0
2nd row 1.0
3rd row 0.0
4th row 1.0
5th row 1.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (1.0, 0.0) take over 50.0%
  • The largest value (1.0) is over 3.11 times larger than the second largest value (0.0)
  • PhysActivity has words of constant length

Fruits

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240
  • The largest value (1.0) is over 1.73 times larger than the second largest value (0.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 0.0
2nd row 0.0
3rd row 1.0
4th row 1.0
5th row 1.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (1.0, 0.0) take over 50.0%
  • The largest value (1.0) is over 1.73 times larger than the second largest value (0.0)
  • Fruits has words of constant length

Veggies

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240
  • The largest value (1.0) is over 4.3 times larger than the second largest value (0.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 1.0
2nd row 0.0
3rd row 0.0
4th row 1.0
5th row 1.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (1.0, 0.0) take over 50.0%
  • The largest value (1.0) is over 4.3 times larger than the second largest value (0.0)
  • Veggies has words of constant length

HvyAlcoholConsump

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240
  • The largest value (0.0) is over 16.79 times larger than the second largest value (1.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 0.0
2nd row 0.0
3rd row 0.0
4th row 0.0
5th row 0.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (0.0, 1.0) take over 50.0%
  • The largest value (0.0) is over 16.79 times larger than the second largest value (1.0)
  • HvyAlcoholConsump has words of constant length

AnyHealthcare

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240
  • The largest value (1.0) is over 19.43 times larger than the second largest value (0.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 1.0
2nd row 0.0
3rd row 1.0
4th row 1.0
5th row 1.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (1.0, 0.0) take over 50.0%
  • The largest value (1.0) is over 19.43 times larger than the second largest value (0.0)
  • AnyHealthcare has words of constant length

NoDocbcCost

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240
  • The largest value (0.0) is over 10.88 times larger than the second largest value (1.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 0.0
2nd row 1.0
3rd row 1.0
4th row 0.0
5th row 0.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (0.0, 1.0) take over 50.0%
  • The largest value (0.0) is over 10.88 times larger than the second largest value (1.0)
  • NoDocbcCost has words of constant length

GenHlth

categorical

Approximate Distinct Count 5
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 5.0
2nd row 3.0
3rd row 5.0
4th row 2.0
5th row 2.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (2.0, 3.0) take over 50.0%
  • GenHlth has words of constant length
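
The "top 2 categories take over 50.0%" insight is informative only for columns with more than two categories, such as GenHlth. For a 0/1 flag with no missing values, the two categories always cover every row. The sketch below computes a top-k share on invented data; the top_k_share helper is hypothetical.

```python
from collections import Counter

def top_k_share(values, k=2):
    """The k most common values and the fraction of rows they cover."""
    top = Counter(values).most_common(k)
    share = sum(n for _, n in top) / len(values)
    return [v for v, _ in top], share

# Hypothetical five-level rating column, loosely shaped like GenHlth.
ratings = [2.0] * 4 + [3.0] * 3 + [1.0] * 2 + [4.0] * 1
labels, share = top_k_share(ratings)

# A binary flag is always fully covered by its top two categories.
_, binary_share = top_k_share([0.0, 1.0, 0.0, 0.0])
print(labels, share, binary_share)
```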

MentHlth

numerical

Approximate Distinct Count 31
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 4058880
Mean 3.1848
Minimum 0
Maximum 30
Zeros 175680
Zeros (%) 69.2%
Negatives 0
Negatives (%) 0.0%
  • MentHlth is skewed right (γ1 = 2.7211)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 2
95-th Percentile 26
Maximum 30
Range 30
IQR 2

Descriptive Statistics

Mean 3.1848
Standard Deviation 7.4128
Variance 54.9503
Sum 807913
Skewness 2.7211
Kurtosis 6.4415
Coefficient of Variation 2.3276
  • MentHlth is not normally distributed (p-value 1.0639628218958111e-24)
  • MentHlth has 36208 outliers

PhysHlth

numerical

Approximate Distinct Count 31
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 4058880
Mean 4.2421
Minimum 0
Maximum 30
Zeros 160052
Zeros (%) 63.1%
Negatives 0
Negatives (%) 0.0%
  • PhysHlth is skewed right (γ1 = 2.2074)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 3
95-th Percentile 30
Maximum 30
Range 30
IQR 3

Descriptive Statistics

Mean 4.2421
Standard Deviation 8.718
Variance 76.0027
Sum 1.0761e+06
Skewness 2.2074
Kurtosis 3.4961
Coefficient of Variation 2.0551
  • PhysHlth is not normally distributed (p-value 2.3001923313344183e-24)
  • PhysHlth has 40949 outliers
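
Both MentHlth and PhysHlth are dominated by zeros, which explains why Q1 and the median are 0 for each: once more than half of the values are zero, the median must be zero. The sketch below shows the zero-share calculation on invented data and rechecks the percentages quoted in the report. The zero_share helper is hypothetical.

```python
from statistics import median

def zero_share(values):
    """Count of zeros and their percentage of all values."""
    zeros = sum(1 for v in values if v == 0)
    return zeros, 100.0 * zeros / len(values)

toy = [0, 0, 0, 5, 30]            # hypothetical zero-heavy sample
zeros, pct = zero_share(toy)
toy_median = median(toy)          # zero, because most values are zero

# Arithmetic check of the shares reported for the two columns.
ment_pct = round(100.0 * 175680 / 253680, 2)
phys_pct = round(100.0 * 160052 / 253680, 2)
print(zeros, pct, toy_median, ment_pct, phys_pct)
```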

DiffWalk

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240
  • The largest value (0.0) is over 4.94 times larger than the second largest value (1.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 1.0
2nd row 0.0
3rd row 1.0
4th row 0.0
5th row 0.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (0.0, 1.0) take over 50.0%
  • The largest value (0.0) is over 4.94 times larger than the second largest value (1.0)
  • DiffWalk has words of constant length

Sex

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 0.0
2nd row 0.0
3rd row 0.0
4th row 0.0
5th row 0.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (0.0, 1.0) take over 50.0%
  • Sex has words of constant length

Age

numerical

Approximate Distinct Count 13
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 4058880
Mean 8.0321
Minimum 1
Maximum 13
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Age is skewed left (γ1 = -0.3599)

Quantile Statistics

Minimum 1
5-th Percentile 2
Q1 6
Median 8
Q3 10
95-th Percentile 13
Maximum 13
Range 12
IQR 4

Descriptive Statistics

Mean 8.0321
Standard Deviation 3.0542
Variance 9.3283
Sum 2.0376e+06
Skewness -0.3599
Kurtosis -0.5812
Coefficient of Variation 0.3803
  • Age is not normally distributed (p-value 9.282968327804983e-06)

Education

categorical

Approximate Distinct Count 6
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240
  • The largest value (6.0) is over 1.54 times larger than the second largest value (5.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 4.0
2nd row 6.0
3rd row 4.0
4th row 3.0
5th row 5.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (6.0, 5.0) take over 50.0%
  • The largest value (6.0) is over 1.54 times larger than the second largest value (5.0)
  • Education has words of constant length

Income

categorical

Approximate Distinct Count 8
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17250240
  • The largest value (8.0) is over 2.09 times larger than the second largest value (7.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 3.0
2nd row 1.0
3rd row 8.0
4th row 6.0
5th row 4.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 507360
  • The top 2 categories (8.0, 7.0) take over 50.0%
  • The largest value (8.0) is over 2.09 times larger than the second largest value (7.0)
  • Income has words of constant length

Interactions

Correlations

Missing Values